N=1:

rank | frequency | n-gram |
---|---|---|
1 | 114807 | -а |
2 | 77283 | -е |
3 | 71504 | -и |
4 | 53626 | -т |
5 | 48678 | -о |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 43229 | -те |
2 | 36463 | -та |
3 | 32282 | -от |
4 | 19574 | -то |
5 | 18950 | -ни |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 32822 | -ите |
2 | 29188 | -ата |
3 | 14566 | -иот |
4 | 9642 | -ото |
5 | 8893 | -ски |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 10481 | -ните |
2 | 8918 | -ната |
3 | 7802 | -ниот |
4 | 7112 | -ката |
5 | 5764 | -ките |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 4265 | -ањето |
2 | 3745 | -ијата |
3 | 3654 | -ување |
4 | 3297 | -ската |
5 | 3222 | -скиот |

The tables show the most frequent letter N-grams at the end of words for N=1…5. The procedure parallels 2.2.5 Most frequent word beginnings; the aim here is suffix detection rather than prefix detection.
For N=3, the corresponding query is:
SELECT @pos := @pos + 1 AS pos, xx.* FROM (SELECT @pos := 0) r, (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word, 3)) AS ngram FROM words WHERE w_id > 100 GROUP BY RIGHT(word, 3) ORDER BY cnt DESC) xx LIMIT 5;
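The same counting can be sketched outside the database. The Python snippet below is only an illustration, not the procedure used to produce the tables: the function name `top_endings`, its parameters, and the toy word list are hypothetical, and it assumes each entry of the list is counted once, the way the query counts one row of `words` per entry. The `w_id > 100` filter from the query is not reproduced.

```python
from collections import Counter


def top_endings(words, n, k=5):
    """Return the k most frequent word-final letter n-grams as (rank, count, ngram)."""
    # Like RIGHT(word, n) in the query, word[-n:] yields the whole word
    # when the word is shorter than n characters.
    counts = Counter("-" + word[-n:] for word in words)
    # Sort by descending count; ties are broken alphabetically so the
    # ranking is reproducible (the SQL query leaves tie order unspecified).
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, count, ngram)
        for rank, (ngram, count) in enumerate(ranked[:k], start=1)
    ]


# Hypothetical toy word list; the real statistics come from the corpus table.
sample = ["книгата", "жената", "куќата", "луѓето", "селото", "градот"]

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, top_endings(sample, n))
```

Running the loop over N=1…5 mirrors the five tables above; changing `k` corresponds to changing the `LIMIT` in the query.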